An Initialization Method for the K-means Algorithm using RNN and Coupling Degree
Abstract
K-means is widely used for general clustering, so its performance is critical. That performance depends heavily on the initial cluster centers, because the algorithm may converge to any of numerous local minima. This paper proposes an initialization method for selecting the initial cluster centers for K-means clustering. The method is based on reverse nearest neighbor (RNN) search and coupling degree. Reverse nearest neighbor search retrieves all points in a given data set whose nearest neighbor is a given query point, while the coupling degree between neighborhoods of nodes is defined, based on the neighborhood-based rough set model, as the amount of similarity between objects. The initial cluster centers computed with this method are found to be very close to the desired cluster centers for iterative clustering algorithms. The application of the proposed method to the K-means clustering algorithm is demonstrated. Experiments on several popular datasets show the advantages of the proposed method.
General Terms: Artificial intelligence, data clustering, partitioning methods, K-means initialization methods
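The reverse nearest neighbor definition in the abstract can be made concrete with a small brute-force sketch. Everything below is illustrative and not taken from the paper: the function names are hypothetical, Euclidean distance is assumed, and `seeds_by_rnn_count` is a simple stand-in heuristic that ranks points by the size of their RNN set. The coupling degree step is not reproduced, since the abstract does not specify it.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself.

    Ties are broken in favor of the lowest index.
    """
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: dist(points[i], points[j]),
    )


def reverse_nearest_neighbors(points):
    """Map each index q to the indices of all points whose nearest neighbor is q."""
    rnn = {q: [] for q in range(len(points))}
    for i in range(len(points)):
        rnn[nearest_neighbor(points, i)].append(i)
    return rnn


def seeds_by_rnn_count(points, k):
    """Hypothetical heuristic (an assumption, not the paper's method):
    pick the k points with the largest RNN sets as initial centers."""
    rnn = reverse_nearest_neighbors(points)
    ranked = sorted(rnn, key=lambda q: (-len(rnn[q]), q))
    return [points[q] for q in ranked[:k]]


# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
rnn = reverse_nearest_neighbors(points)
seeds = seeds_by_rnn_count(points, 2)
```

On this toy data the two points with the largest RNN sets fall in different groups, but ranking by RNN count alone can select several seeds from one dense region. Presumably that is where a similarity measure such as the coupling degree would come in; that reading is an assumption drawn from the abstract, not a confirmed detail of the method.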
Similar resources
Improved COA with Chaotic Initialization and Intelligent Migration for Data Clustering
A well-known clustering algorithm is K-means. This algorithm, besides advantages such as high speed and ease of employment, suffers from the problem of local optima. In order to overcome this problem, a lot of studies have been done in clustering. This paper presents a hybrid Extended Cuckoo Optimization Algorithm (ECOA) and K-means (K), which is called ECOA-K. The COA algorithm has advantages ...
A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS
Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Due to the vast usage of clustering algorithms in many fields, a lot of research is still going on to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of cluster centers and hence can become trapped in a local optimum. In this paper...
Modified Convex Data Clustering Algorithm Based on Alternating Direction Method of Multipliers
Given that the main weakness of most standard methods, including k-means and hierarchical data clustering, is their sensitivity to initialization and their tendency to become trapped in local minima, this paper proposes a modification of convex data clustering in which there is no need to be particular about how initial values are selected. Due to properly converting the task of optimization to an equivalent...
Enhanced k-means Clustering Algorithm, by Abdessalam H. Elhabbash. Supervisor: Prof. Hatem Hamad
Data clustering is an unsupervised classification method that aims at creating groups of objects, or clusters, in such a way that objects in the same cluster are very similar and objects in different clusters are quite distinct. Though k-means is very popular for general clustering, it suffers from some disadvantages such as (1) Its performance depends highly on initial cluster centers, (2) The numb...
Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm
Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar to one another in some sense. It is a main task of exploratory data mining and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K...
Publication date: 2011